Educative: Interactive Courses for Software Developers

Books, articles, and programming frameworks have established a cultural convention that every database table must have a primary key column with the following characteristics:

The primary key’s column name is id.
Its data type is a 32-bit or 64-bit integer.
Unique values are generated automatically.

The presence of a column named id in every table is so common that this has become synonymous with a primary key. Programmers learning SQL get the false idea that a primary key always means a column defined in this manner.

Enter to Rename, Shift+Enter to Preview

Creating the Bugs table

Adding an id column to every table causes several effects that make its use seem arbitrary.

Making a redundant key#

We may sometimes see an id column defined as the primary key simply for the sake of tradition, even when another column in the same table could be used as the natural primary key. The other column may even be defined with a UNIQUE constraint. For example, in the Bugs table, we could label bugs using a string with a mnemonic for the project the bug belongs to or other identifying information.

Enter to Rename, Shift+Enter to Preview

Making a redundant key

Let’s run the query in the following playground to see the effect on the database.

Enter to Rename, Shift+Enter to Preview

Retrieving data after making a redundant key

The bug_id column in the example has similar usage to id in that it serves to identify each row uniquely.

Allowing duplicate rows#

A compound key consists of multiple columns. One typical use for a compound key is in an intersection table like BugsProducts. The primary key should ensure that a given combination of values for bug_id and product_id appears only once in the table, even though each value may appear many times in different pairings.

However, when we use the mandatory id column as the primary key, the constraint no longer applies to two columns that should be unique.

Enter to Rename, Shift+Enter to Preview

Creating the BugsProducts table

Let’s try to run the code in the playground below.

Enter to Rename, Shift+Enter to Preview

Retrieving data from the BugsProducts table

We’ll find that there are 3 records for the same bug_id and the product_id. This means that these records are duplicated.

Duplicates in this intersection table cause unintended results (because of duplicated records) when we use the table to match Bugs with Products. To prevent duplicates, we can declare a UNIQUE constraint over the two columns besides id:

Enter to Rename, Shift+Enter to Preview

Creating BugsProducts table with constraint

But if we need a unique constraint over those two columns anyway, the id column is superfluous.

Obscuring the meaning of the key#

The word “code” has a number of definitions, one of which is a way to communicate a message with brevity or secrecy. In programming, we should have the opposite goal — to make the meaning clearer.

The name id is so generic that it holds no meaning. This is especially important when we join two tables, and they have the same primary key column name.

Enter to Rename, Shift+Enter to Preview

Joining two columns having the same primary key name

How do you tell the bug id from the account id in your application code if you reference columns by name instead of by ordinal position? This is a problem, especially in dynamic languages like PHP, when a query result is an associative array: one column overwrites the other unless you specify column aliases in your query.

The name of the id column doesn’t help make the query any clearer. If, instead, the columns were named bug_id and account_id, the reader would have a much easier time reading the query results. We use a primary key to address individual rows of a table, so the column’s name should give a clue about the type of entity in that table.

Using `USING`#

A lot of us are probably already familiar with the SQL syntax for a join — using the keywords JOIN and ON before an expression to evaluate matching rows in the two tables.

Let’s try to run this query in the next playground.

Enter to Rename, Shift+Enter to Preview

Using JOIN and ON in a query

SQL also supports a more concise syntax for expressing a join between two tables. We can rewrite the previous query in the following way if the columns have the same name in both tables:

Enter to Rename, Shift+Enter to Preview

Using the USING in a query

However, if all tables are required to define a pseudo key as a primary key named id, then a foreign key column in a dependent table can never use the same name as the primary key it references. Instead, we must always use the more verbose ON syntax:

Enter to Rename, Shift+Enter to Preview

Using JOIN and ON in a query

Compound keys are hard#

Some developers refuse to use compound keys because they feel that these keys are too hard to use. Any expression that compares a key to another must compare all columns. A foreign key that references a compound primary key must itself be a compound foreign key. It requires more typing to use compound keys.

This refusal is like a mathematician refusing to use two-dimensional or three-dimensional coordinates, and insisting on performing all calculations as though objects exist within a one-dimensional, linear space. It’s true that this would make a lot of geometry and trigonometry much simpler, but it fails to describe real-world objects that we need to work with.

Introduction

Logical Antipattern - Jaywalking

Logical Antipattern - Naive Trees

Logical Antipattern - ID Required

Logical Antipattern - Keyless Entry

Assessment: Logical Antipatterns - Part 1

Logical Antipattern - Entity-Attribute-Value

Logical Antipattern - Polymorphic Associations

Logical Antipattern - Multicolumn Attributes

Logical Antipattern - Metadata Tribbles

Assessment: Logical Antipatterns - Part 2

Physical Antipattern - Rounding Errors

Physical Antipattern - 31 Flavors

Physical Antipattern - Phantom Files

Physical Antipattern - Index Shotgun

Assessment: Physical Antipatterns

Query Antipattern - Fear of the Unknown

Query Antipattern - Ambiguous Groups

Query Antipattern - Random Selection

Query Antipattern - Poor Man’s Search Engine

Query Antipattern - Spaghetti Query

Query Antipattern - Implicit Columns

Assessment: Query Antipattern

Application Development Antipattern - Readable Passwords

Application Development Antipattern - SQL Injection

Application Development Antipattern - Pseudokey Neat-Freak

Application Development Antipattern - See No Evil

Application Development Antipattern - Diplomatic Immunity

Application Development Antipattern - Magic Beans

Assessment: Application Development Antipattern

Conclusion

Antipattern: One Size Fits All

Making a redundant key#

Allowing duplicate rows#

Obscuring the meaning of the key#

Using `USING`#

Compound keys are hard#

Antipattern: One Size Fits All

Making a redundant key#

Allowing duplicate rows#

Obscuring the meaning of the key#

Using USING#

Compound keys are hard#

Using `USING`#